The Imagination of Crowds: Conversational AAC Language Modeling using Crowdsourcing and Large Data Sources

Authors

  • Keith Vertanen
  • Per Ola Kristensson

Abstract

Augmented and alternative communication (AAC) devices enable users with certain communication disabilities to participate in everyday conversations. Such devices often rely on statistical language models to improve text entry by offering word predictions. These predictions can be improved if the language model is trained on data that closely reflects the style of the users’ intended communications. Unfortunately, there is no large dataset consisting of genuine AAC messages. In this paper we demonstrate how we can crowdsource the creation of a large set of fictional AAC messages. We show that these messages model conversational AAC better than the currently used datasets based on telephone conversations or newswire text. We leverage our crowdsourced messages to intelligently select sentences from much larger sets of Twitter, blog and Usenet data. Compared to a model trained only on telephone transcripts, our best performing model reduced perplexity on three test sets of AAC-like communications by 60–82% relative. This translated to a potential keystroke savings in a predictive keyboard interface of 5–11%.
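To make the "relative perplexity reduction" metric in the abstract concrete, here is a minimal sketch. It is not the authors' setup: it assumes a toy add-one smoothed unigram model, and the sentences and the helper names (`train_unigram`, `perplexity`, `relative_reduction`) are invented for illustration only.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count word occurrences in whitespace-tokenized training sentences."""
    return Counter(word for s in sentences for word in s.split())


def perplexity(counts, vocab, sentences):
    """Per-word perplexity under an add-one smoothed unigram model."""
    total = sum(counts.values())
    log_prob = 0.0
    n_words = 0
    for s in sentences:
        for word in s.split():
            # Add-one smoothing over a fixed, shared vocabulary.
            p = (counts[word] + 1) / (total + len(vocab))
            log_prob += math.log2(p)
            n_words += 1
    return 2 ** (-log_prob / n_words)


def relative_reduction(baseline, improved):
    """Relative perplexity reduction, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical toy data (assumption): formal text vs. conversational text.
out_of_domain = ["the market closed higher today",
                 "officials said the report was final"]
in_domain = ["i am thirsty", "can you help me please", "i need a drink"]
test_set = ["i need help please"]

# Both models share one vocabulary so their perplexities are comparable.
vocab = {w for s in out_of_domain + in_domain + test_set for w in s.split()}

ppl_out = perplexity(train_unigram(out_of_domain), vocab, test_set)
ppl_in = perplexity(train_unigram(in_domain), vocab, test_set)

print(f"out-of-domain perplexity: {ppl_out:.2f}")
print(f"in-domain perplexity:     {ppl_in:.2f}")
print(f"relative reduction:       {relative_reduction(ppl_out, ppl_in):.1f}%")
```

The model trained on text closer in style to the test sentences assigns them higher probability, which shows up as lower perplexity. The same percentage formula applies regardless of the language model used.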

Similar resources

A crowdsourcing method to develop virtual human conversational agents

Educators in medicine, psychology, and the military want to provide their students with interpersonal skills practice. Virtual humans offer structured learning of interview skills, can facilitate learning about unusual conditions, and are always available. However, the creation of virtual humans with the ability to understand and respond to natural language requires costly engineering by conver...

Tag Questions in Persian: Investigating the Conversational Functions

This article intends to identify the use and typify the functions of tag questions (TQs) in Persian everyday conversations and dialogic interaction. The analyses were made based on two data sources: a documentary film titled Commander in which the participants are engaged in free interactions, and an audio-recorded instrument named CALLFRIEND which consists of Iranian native...

The Relationship between Self-esteem and Conversational Dominance of Iranian EFL Learners’ Speaking

The crucial role of affective factors like anxiety, inhibition, motivation and self-esteem has long been of interest in the field of language learning due to their enormous association with the cognitive processes involved in performance in a second or foreign language. This study aimed to investigate the relationship between Iranian EFL learners’ self-esteem and conversational dominance in ...

Perform Three Data Mining Tasks with Crowdsourcing Process

In data mining studies, performing feature selection by hand is complex, so some of the labeling needs to be sent to workers through crowdsourcing. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the users' age or place of residence. Uncertainty about the performance of virtual user...

A Conversational Movie Search System Based on Conditional Random Fields

Online streaming companies such as Netflix have become dominant in the media distribution sector. However, such media delivery services often support very rudimentary search, especially for natural language queries. To provide a more natural search interface, we have developed a conversational movie search system, which parses the recognition hypothesis of a spoken query into semantic classes u...

Publication date: 2011